Principles of Neural Network Architecture Design - Invertibility and Domain Knowledge
Neural network architectures allow a tremendous variety of design choices. In this work, we study two principles underlying these architectures: first, the design and application of invertible neural networks (INNs); second, the incorporation of domain knowledge into neural network architectures. After introducing the mathematical foundations of deep learning, we address the invertibility of standard feedforward neural networks from a mathematical perspective. These results motivate our proposed invertible residual networks (i-ResNets). This architecture class is then studied in two scenarios: first, we propose ways to use i-ResNets as a normalizing flow and demonstrate their applicability to high-dimensional generative modeling; second, we study the excessive invariance of common deep image classifiers and discuss consequences for adversarial robustness. We finish with a study of convolutional neural networks for tumor classification based on imaging mass spectrometry (IMS) data. For this application, we propose an adapted architecture guided by our knowledge of the domain of IMS data and show its superior performance on two challenging tumor classification datasets.
Robust Hybrid Learning With Expert Augmentation
Hybrid modelling reduces the misspecification of expert models by combining
them with machine learning (ML) components learned from data. As with many ML
algorithms, hybrid model performance guarantees are limited to the training
distribution. Leveraging the insight that the expert model is usually valid
even outside the training domain, we overcome this limitation by introducing a
hybrid data augmentation strategy termed expert augmentation. Based on
a probabilistic formalization of hybrid modelling, we show why expert
augmentation improves generalization. Finally, we validate the practical
benefits of augmented hybrid models on a set of controlled experiments,
modelling dynamical systems described by ordinary and partial differential
equations.